[Feature] DDPG compatibility with compile #2555

Merged: 47 commits merged on Dec 14, 2024

[ghstack-poisoned]

pytorch-bot bot commented Nov 12, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/2555

Note: Links to docs will display an error until the docs builds have been completed.

❗ 1 Active SEVs

There is 1 currently active SEV. If your PR is affected, please view it below:

❌ 9 New Failures, 9 Unrelated Failures

As of commit 10093ab with merge base e2be42e:

NEW FAILURES - The following jobs have failed:

FLAKY - The following jobs failed but were likely due to flakiness present on trunk:

BROKEN TRUNK - The following jobs failed but were present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.

vmoens added a commit that referenced this pull request Nov 12, 2024
ghstack-source-id: 9d88c127091ad6e711ef49e9f48aaedad15b1c05
Pull Request resolved: #2555
@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Nov 12, 2024
vmoens added a commit that referenced this pull request Nov 12, 2024
ghstack-source-id: 25f076bfb94d0a22dc6f9222faf893e515dc9291
Pull Request resolved: #2555
vmoens added a commit that referenced this pull request Nov 12, 2024
ghstack-source-id: 4d13215ad9c4fc404b9c3ba76e55f5548ffbd6c0
Pull Request resolved: #2555
vmoens added a commit that referenced this pull request Nov 12, 2024
ghstack-source-id: e4d7e1c0e928cc79c9630fb452d720ed730e6bba
Pull Request resolved: #2555
vmoens added a commit that referenced this pull request Nov 12, 2024
ghstack-source-id: 784457d870658d7fc36a1b36c5f1fe56af71e7a4
Pull Request resolved: #2555
vmoens added a commit that referenced this pull request Nov 12, 2024
ghstack-source-id: 4eacd3c1d004f106488170219fb293f8869cdb52
Pull Request resolved: #2555
vmoens added a commit that referenced this pull request Nov 12, 2024
ghstack-source-id: dde505dac2ee16f4e57ada9f1091b2882dbec8fd
Pull Request resolved: #2555
vmoens added a commit that referenced this pull request Nov 12, 2024
ghstack-source-id: 5b4f0e90f4aa7ded0128f3729f30cbc69e3e22fa
Pull Request resolved: #2555
vmoens added a commit that referenced this pull request Nov 12, 2024
ghstack-source-id: 22a2e1b61404ebe8568c6add4754bca8282e9121
Pull Request resolved: #2555
vmoens added a commit that referenced this pull request Nov 12, 2024
ghstack-source-id: f8cf9a97569a7fe5a110ac3797d177c71be2dbbe
Pull Request resolved: #2555
vmoens added a commit that referenced this pull request Nov 12, 2024
ghstack-source-id: 57c671de6604e4e3c681e1cb994a09578d02ca4c
Pull Request resolved: #2555
vmoens added a commit that referenced this pull request Nov 12, 2024
ghstack-source-id: 92a6e4e8f6d3af231fb9a546e6fbe85d5d22cef9
Pull Request resolved: #2555

github-actions bot commented Dec 13, 2024

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 149. Improved: $\large\color{#35bf28}6$. Worsened: $\large\color{#d91a1a}5$.

Name Max Mean Ops Ops on Repo HEAD Change
test_simple 0.4383s 0.4292s 2.3299 Ops/s 2.2237 Ops/s $\color{#35bf28}+4.78\%$
test_transformed 0.5996s 0.5979s 1.6725 Ops/s 1.5932 Ops/s $\color{#35bf28}+4.98\%$
test_serial 1.3400s 1.3384s 0.7472 Ops/s 0.7316 Ops/s $\color{#35bf28}+2.13\%$
test_parallel 1.4087s 1.3132s 0.7615 Ops/s 0.7491 Ops/s $\color{#35bf28}+1.65\%$
test_step_mdp_speed[True-True-True-True-True] 0.1993ms 29.5873μs 33.7982 KOps/s 33.8871 KOps/s $\color{#d91a1a}-0.26\%$
test_step_mdp_speed[True-True-True-True-False] 79.3510μs 17.4406μs 57.3374 KOps/s 57.4630 KOps/s $\color{#d91a1a}-0.22\%$
test_step_mdp_speed[True-True-True-False-True] 66.6040μs 16.6339μs 60.1182 KOps/s 60.1641 KOps/s $\color{#d91a1a}-0.08\%$
test_step_mdp_speed[True-True-True-False-False] 32.4410μs 9.7745μs 102.3068 KOps/s 100.8531 KOps/s $\color{#35bf28}+1.44\%$
test_step_mdp_speed[True-True-False-True-True] 85.8220μs 31.9412μs 31.3075 KOps/s 31.6210 KOps/s $\color{#d91a1a}-0.99\%$
test_step_mdp_speed[True-True-False-True-False] 74.6310μs 19.4252μs 51.4796 KOps/s 51.5368 KOps/s $\color{#d91a1a}-0.11\%$
test_step_mdp_speed[True-True-False-False-True] 68.3280μs 18.6770μs 53.5418 KOps/s 53.8073 KOps/s $\color{#d91a1a}-0.49\%$
test_step_mdp_speed[True-True-False-False-False] 38.6830μs 11.7047μs 85.4357 KOps/s 86.1531 KOps/s $\color{#d91a1a}-0.83\%$
test_step_mdp_speed[True-False-True-True-True] 0.1103ms 33.8828μs 29.5135 KOps/s 29.9561 KOps/s $\color{#d91a1a}-1.48\%$
test_step_mdp_speed[True-False-True-True-False] 58.5300μs 21.2333μs 47.0958 KOps/s 47.2921 KOps/s $\color{#d91a1a}-0.42\%$
test_step_mdp_speed[True-False-True-False-True] 95.9410μs 18.5980μs 53.7692 KOps/s 54.1433 KOps/s $\color{#d91a1a}-0.69\%$
test_step_mdp_speed[True-False-True-False-False] 34.6250μs 11.6872μs 85.5635 KOps/s 85.7530 KOps/s $\color{#d91a1a}-0.22\%$
test_step_mdp_speed[True-False-False-True-True] 0.1095ms 35.4135μs 28.2378 KOps/s 28.2893 KOps/s $\color{#d91a1a}-0.18\%$
test_step_mdp_speed[True-False-False-True-False] 85.5610μs 23.2632μs 42.9864 KOps/s 43.4631 KOps/s $\color{#d91a1a}-1.10\%$
test_step_mdp_speed[True-False-False-False-True] 50.8460μs 20.4580μs 48.8806 KOps/s 49.2558 KOps/s $\color{#d91a1a}-0.76\%$
test_step_mdp_speed[True-False-False-False-False] 74.7710μs 13.4734μs 74.2201 KOps/s 74.3307 KOps/s $\color{#d91a1a}-0.15\%$
test_step_mdp_speed[False-True-True-True-True] 80.6820μs 33.8528μs 29.5397 KOps/s 29.4705 KOps/s $\color{#35bf28}+0.23\%$
test_step_mdp_speed[False-True-True-True-False] 72.2360μs 21.4289μs 46.6660 KOps/s 46.8068 KOps/s $\color{#d91a1a}-0.30\%$
test_step_mdp_speed[False-True-True-False-True] 55.4350μs 21.2319μs 47.0989 KOps/s 46.8914 KOps/s $\color{#35bf28}+0.44\%$
test_step_mdp_speed[False-True-True-False-False] 62.2670μs 12.9199μs 77.4002 KOps/s 76.9548 KOps/s $\color{#35bf28}+0.58\%$
test_step_mdp_speed[False-True-False-True-True] 78.9090μs 35.6297μs 28.0665 KOps/s 28.1552 KOps/s $\color{#d91a1a}-0.32\%$
test_step_mdp_speed[False-True-False-True-False] 72.3070μs 23.1054μs 43.2798 KOps/s 42.9675 KOps/s $\color{#35bf28}+0.73\%$
test_step_mdp_speed[False-True-False-False-True] 3.4009ms 23.3020μs 42.9147 KOps/s 43.9333 KOps/s $\color{#d91a1a}-2.32\%$
test_step_mdp_speed[False-True-False-False-False] 69.6420μs 14.7649μs 67.7283 KOps/s 67.6567 KOps/s $\color{#35bf28}+0.11\%$
test_step_mdp_speed[False-False-True-True-True] 98.7260μs 37.8643μs 26.4101 KOps/s 26.9486 KOps/s $\color{#d91a1a}-2.00\%$
test_step_mdp_speed[False-False-True-True-False] 54.6230μs 24.8254μs 40.2813 KOps/s 40.5015 KOps/s $\color{#d91a1a}-0.54\%$
test_step_mdp_speed[False-False-True-False-True] 83.5170μs 22.9648μs 43.5450 KOps/s 44.9739 KOps/s $\color{#d91a1a}-3.18\%$
test_step_mdp_speed[False-False-True-False-False] 44.2840μs 14.7802μs 67.6581 KOps/s 68.3779 KOps/s $\color{#d91a1a}-1.05\%$
test_step_mdp_speed[False-False-False-True-True] 95.9200μs 38.9197μs 25.6939 KOps/s 25.9624 KOps/s $\color{#d91a1a}-1.03\%$
test_step_mdp_speed[False-False-False-True-False] 87.7750μs 26.5656μs 37.6427 KOps/s 38.3544 KOps/s $\color{#d91a1a}-1.86\%$
test_step_mdp_speed[False-False-False-False-True] 53.9320μs 24.4353μs 40.9244 KOps/s 42.1273 KOps/s $\color{#d91a1a}-2.86\%$
test_step_mdp_speed[False-False-False-False-False] 68.6500μs 16.4356μs 60.8435 KOps/s 61.2900 KOps/s $\color{#d91a1a}-0.73\%$
test_values[generalized_advantage_estimate-True-True] 10.6559ms 9.6777ms 103.3304 Ops/s 103.4822 Ops/s $\color{#d91a1a}-0.15\%$
test_values[vec_generalized_advantage_estimate-True-True] 38.4780ms 35.0055ms 28.5669 Ops/s 28.5303 Ops/s $\color{#35bf28}+0.13\%$
test_values[td0_return_estimate-False-False] 0.2749ms 0.1801ms 5.5518 KOps/s 5.5652 KOps/s $\color{#d91a1a}-0.24\%$
test_values[td1_return_estimate-False-False] 28.5797ms 24.4395ms 40.9174 Ops/s 41.6691 Ops/s $\color{#d91a1a}-1.80\%$
test_values[vec_td1_return_estimate-False-False] 36.6749ms 34.6427ms 28.8661 Ops/s 29.5515 Ops/s $\color{#d91a1a}-2.32\%$
test_values[td_lambda_return_estimate-True-False] 39.3550ms 34.7193ms 28.8024 Ops/s 29.0819 Ops/s $\color{#d91a1a}-0.96\%$
test_values[vec_td_lambda_return_estimate-True-False] 35.6959ms 33.7182ms 29.6575 Ops/s 29.4726 Ops/s $\color{#35bf28}+0.63\%$
test_gae_speed[generalized_advantage_estimate-False-1-512] 9.3381ms 8.3617ms 119.5929 Ops/s 120.4084 Ops/s $\color{#d91a1a}-0.68\%$
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 2.4310ms 2.0146ms 496.3728 Ops/s 505.2717 Ops/s $\color{#d91a1a}-1.76\%$
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 0.6152ms 0.3618ms 2.7641 KOps/s 2.8061 KOps/s $\color{#d91a1a}-1.50\%$
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 47.9352ms 44.9178ms 22.2629 Ops/s 21.9895 Ops/s $\color{#35bf28}+1.24\%$
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 3.8999ms 3.0546ms 327.3772 Ops/s 328.5722 Ops/s $\color{#d91a1a}-0.36\%$
test_dqn_speed[False-None] 1.8764ms 1.3753ms 727.1030 Ops/s 694.6375 Ops/s $\color{#35bf28}+4.67\%$
test_dqn_speed[False-backward] 2.2193ms 1.9073ms 524.2972 Ops/s 528.8032 Ops/s $\color{#d91a1a}-0.85\%$
test_dqn_speed[True-None] 0.7638ms 0.4730ms 2.1142 KOps/s 2.1055 KOps/s $\color{#35bf28}+0.41\%$
test_dqn_speed[True-backward] 0.9671ms 0.8828ms 1.1327 KOps/s 1.1142 KOps/s $\color{#35bf28}+1.66\%$
test_dqn_speed[reduce-overhead-None] 0.6128ms 0.4696ms 2.1295 KOps/s 2.1230 KOps/s $\color{#35bf28}+0.30\%$
test_dqn_speed[reduce-overhead-backward] 0.9632ms 0.8900ms 1.1236 KOps/s 1.0885 KOps/s $\color{#35bf28}+3.22\%$
test_ddpg_speed[False-None] 4.5693ms 2.8606ms 349.5753 Ops/s 347.6253 Ops/s $\color{#35bf28}+0.56\%$
test_ddpg_speed[False-backward] 4.1861ms 4.0181ms 248.8732 Ops/s 246.4834 Ops/s $\color{#35bf28}+0.97\%$
test_ddpg_speed[True-None] 1.4092ms 1.0046ms 995.4624 Ops/s 984.6559 Ops/s $\color{#35bf28}+1.10\%$
test_ddpg_speed[True-backward] 1.9631ms 1.8879ms 529.6761 Ops/s 523.5756 Ops/s $\color{#35bf28}+1.17\%$
test_ddpg_speed[reduce-overhead-None] 1.3726ms 0.9984ms 1.0016 KOps/s 990.4908 Ops/s $\color{#35bf28}+1.12\%$
test_ddpg_speed[reduce-overhead-backward] 2.0086ms 1.9006ms 526.1606 Ops/s 514.9551 Ops/s $\color{#35bf28}+2.18\%$
test_sac_speed[False-None] 8.9574ms 8.0784ms 123.7866 Ops/s 123.2516 Ops/s $\color{#35bf28}+0.43\%$
test_sac_speed[False-backward] 11.5997ms 10.8220ms 92.4047 Ops/s 92.5094 Ops/s $\color{#d91a1a}-0.11\%$
test_sac_speed[True-None] 1.9896ms 1.8383ms 543.9934 Ops/s 540.0311 Ops/s $\color{#35bf28}+0.73\%$
test_sac_speed[True-backward] 3.6104ms 3.5258ms 283.6249 Ops/s 278.4204 Ops/s $\color{#35bf28}+1.87\%$
test_sac_speed[reduce-overhead-None] 2.1210ms 1.8308ms 546.2230 Ops/s 539.3422 Ops/s $\color{#35bf28}+1.28\%$
test_sac_speed[reduce-overhead-backward] 3.9723ms 3.5570ms 281.1360 Ops/s 280.3063 Ops/s $\color{#35bf28}+0.30\%$
test_redq_speed[False-None] 15.7508ms 13.4294ms 74.4634 Ops/s 75.6080 Ops/s $\color{#d91a1a}-1.51\%$
test_redq_speed[False-backward] 25.9412ms 23.3638ms 42.8013 Ops/s 43.5709 Ops/s $\color{#d91a1a}-1.77\%$
test_redq_speed[True-None] 6.1007ms 4.6646ms 214.3789 Ops/s 213.5651 Ops/s $\color{#35bf28}+0.38\%$
test_redq_speed[True-backward] 13.9295ms 12.5878ms 79.4422 Ops/s 77.4707 Ops/s $\color{#35bf28}+2.54\%$
test_redq_speed[reduce-overhead-None] 5.4718ms 4.7861ms 208.9404 Ops/s 214.9389 Ops/s $\color{#d91a1a}-2.79\%$
test_redq_speed[reduce-overhead-backward] 13.5844ms 12.3464ms 80.9954 Ops/s 74.1264 Ops/s $\textbf{\color{#35bf28}+9.27\%}$
test_redq_deprec_speed[False-None] 15.1630ms 12.8804ms 77.6375 Ops/s 73.5699 Ops/s $\textbf{\color{#35bf28}+5.53\%}$
test_redq_deprec_speed[False-backward] 20.7248ms 18.8828ms 52.9583 Ops/s 51.8723 Ops/s $\color{#35bf28}+2.09\%$
test_redq_deprec_speed[True-None] 4.0568ms 3.5891ms 278.6226 Ops/s 271.5445 Ops/s $\color{#35bf28}+2.61\%$
test_redq_deprec_speed[True-backward] 8.8159ms 8.0726ms 123.8764 Ops/s 120.2558 Ops/s $\color{#35bf28}+3.01\%$
test_redq_deprec_speed[reduce-overhead-None] 4.1873ms 3.6229ms 276.0257 Ops/s 277.0646 Ops/s $\color{#d91a1a}-0.37\%$
test_redq_deprec_speed[reduce-overhead-backward] 8.8274ms 8.0547ms 124.1514 Ops/s 120.0558 Ops/s $\color{#35bf28}+3.41\%$
test_td3_speed[False-None] 8.4928ms 8.0622ms 124.0352 Ops/s 123.9506 Ops/s $\color{#35bf28}+0.07\%$
test_td3_speed[False-backward] 12.0899ms 10.3681ms 96.4493 Ops/s 94.3966 Ops/s $\color{#35bf28}+2.17\%$
test_td3_speed[True-None] 1.9608ms 1.7112ms 584.3932 Ops/s 574.3696 Ops/s $\color{#35bf28}+1.75\%$
test_td3_speed[True-backward] 3.3820ms 3.3129ms 301.8494 Ops/s 292.6581 Ops/s $\color{#35bf28}+3.14\%$
test_td3_speed[reduce-overhead-None] 1.9588ms 1.7054ms 586.3739 Ops/s 572.7171 Ops/s $\color{#35bf28}+2.38\%$
test_td3_speed[reduce-overhead-backward] 3.3865ms 3.3065ms 302.4351 Ops/s 294.3090 Ops/s $\color{#35bf28}+2.76\%$
test_cql_speed[False-None] 40.3543ms 36.6698ms 27.2704 Ops/s 26.2362 Ops/s $\color{#35bf28}+3.94\%$
test_cql_speed[False-backward] 55.1260ms 48.0893ms 20.7947 Ops/s 20.9633 Ops/s $\color{#d91a1a}-0.80\%$
test_cql_speed[True-None] 17.9051ms 16.5029ms 60.5956 Ops/s 61.7526 Ops/s $\color{#d91a1a}-1.87\%$
test_cql_speed[True-backward] 25.6727ms 22.9829ms 43.5107 Ops/s 44.5752 Ops/s $\color{#d91a1a}-2.39\%$
test_cql_speed[reduce-overhead-None] 17.7900ms 16.3552ms 61.1426 Ops/s 62.2976 Ops/s $\color{#d91a1a}-1.85\%$
test_cql_speed[reduce-overhead-backward] 26.6119ms 24.3723ms 41.0302 Ops/s 42.8922 Ops/s $\color{#d91a1a}-4.34\%$
test_a2c_speed[False-None] 8.4990ms 7.2095ms 138.7063 Ops/s 136.7085 Ops/s $\color{#35bf28}+1.46\%$
test_a2c_speed[False-backward] 14.9827ms 14.2737ms 70.0591 Ops/s 67.2376 Ops/s $\color{#35bf28}+4.20\%$
test_a2c_speed[True-None] 5.0412ms 4.3456ms 230.1171 Ops/s 237.7178 Ops/s $\color{#d91a1a}-3.20\%$
test_a2c_speed[True-backward] 12.3904ms 11.3002ms 88.4943 Ops/s 89.5631 Ops/s $\color{#d91a1a}-1.19\%$
test_a2c_speed[reduce-overhead-None] 5.0236ms 4.3097ms 232.0324 Ops/s 235.9872 Ops/s $\color{#d91a1a}-1.68\%$
test_a2c_speed[reduce-overhead-backward] 11.9526ms 11.2473ms 88.9104 Ops/s 91.0682 Ops/s $\color{#d91a1a}-2.37\%$
test_ppo_speed[False-None] 9.2738ms 7.8736ms 127.0059 Ops/s 131.8147 Ops/s $\color{#d91a1a}-3.65\%$
test_ppo_speed[False-backward] 17.3725ms 15.6534ms 63.8838 Ops/s 66.6883 Ops/s $\color{#d91a1a}-4.21\%$
test_ppo_speed[True-None] 4.2918ms 3.7252ms 268.4441 Ops/s 269.8032 Ops/s $\color{#d91a1a}-0.50\%$
test_ppo_speed[True-backward] 10.3662ms 9.6107ms 104.0506 Ops/s 103.7798 Ops/s $\color{#35bf28}+0.26\%$
test_ppo_speed[reduce-overhead-None] 4.3952ms 3.7806ms 264.5110 Ops/s 262.6126 Ops/s $\color{#35bf28}+0.72\%$
test_ppo_speed[reduce-overhead-backward] 10.9755ms 9.6460ms 103.6699 Ops/s 103.0913 Ops/s $\color{#35bf28}+0.56\%$
test_reinforce_speed[False-None] 7.5602ms 6.5443ms 152.8054 Ops/s 151.5385 Ops/s $\color{#35bf28}+0.84\%$
test_reinforce_speed[False-backward] 10.9659ms 10.3213ms 96.8868 Ops/s 96.3869 Ops/s $\color{#35bf28}+0.52\%$
test_reinforce_speed[True-None] 2.9292ms 2.6292ms 380.3491 Ops/s 370.7822 Ops/s $\color{#35bf28}+2.58\%$
test_reinforce_speed[True-backward] 9.2756ms 8.6689ms 115.3545 Ops/s 111.0241 Ops/s $\color{#35bf28}+3.90\%$
test_reinforce_speed[reduce-overhead-None] 3.1324ms 2.6354ms 379.4429 Ops/s 370.9720 Ops/s $\color{#35bf28}+2.28\%$
test_reinforce_speed[reduce-overhead-backward] 9.5510ms 8.6014ms 116.2595 Ops/s 115.9054 Ops/s $\color{#35bf28}+0.31\%$
test_iql_speed[False-None] 33.7082ms 32.2031ms 31.0529 Ops/s 30.2793 Ops/s $\color{#35bf28}+2.55\%$
test_iql_speed[False-backward] 55.5074ms 46.8166ms 21.3600 Ops/s 21.8549 Ops/s $\color{#d91a1a}-2.26\%$
test_iql_speed[True-None] 11.8967ms 10.5973ms 94.3638 Ops/s 90.4909 Ops/s $\color{#35bf28}+4.28\%$
test_iql_speed[True-backward] 22.7616ms 21.7090ms 46.0639 Ops/s 44.6358 Ops/s $\color{#35bf28}+3.20\%$
test_iql_speed[reduce-overhead-None] 11.7354ms 10.9015ms 91.7305 Ops/s 89.3794 Ops/s $\color{#35bf28}+2.63\%$
test_iql_speed[reduce-overhead-backward] 22.7628ms 21.5527ms 46.3979 Ops/s 43.5799 Ops/s $\textbf{\color{#35bf28}+6.47\%}$
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 6.8980ms 5.0015ms 199.9392 Ops/s 195.8174 Ops/s $\color{#35bf28}+2.10\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.8026ms 0.5125ms 1.9513 KOps/s 1.8662 KOps/s $\color{#35bf28}+4.56\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.7249ms 0.4968ms 2.0131 KOps/s 1.9777 KOps/s $\color{#35bf28}+1.79\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 7.0897ms 4.8819ms 204.8388 Ops/s 202.7408 Ops/s $\color{#35bf28}+1.03\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 2.4354ms 0.5007ms 1.9972 KOps/s 1.9165 KOps/s $\color{#35bf28}+4.21\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.9488ms 0.4780ms 2.0922 KOps/s 2.0560 KOps/s $\color{#35bf28}+1.76\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 1.9710ms 1.6397ms 609.8681 Ops/s 604.8006 Ops/s $\color{#35bf28}+0.84\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 1.8330ms 1.5895ms 629.1470 Ops/s 624.4930 Ops/s $\color{#35bf28}+0.75\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 7.4620ms 4.9381ms 202.5087 Ops/s 193.6359 Ops/s $\color{#35bf28}+4.58\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 1.2052ms 0.6437ms 1.5536 KOps/s 1.4829 KOps/s $\color{#35bf28}+4.77\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 1.0419ms 0.6199ms 1.6132 KOps/s 1.5676 KOps/s $\color{#35bf28}+2.91\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 5.2884ms 4.7598ms 210.0923 Ops/s 199.7338 Ops/s $\textbf{\color{#35bf28}+5.19\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.8405ms 0.5171ms 1.9340 KOps/s 1.8906 KOps/s $\color{#35bf28}+2.29\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.7674ms 0.4914ms 2.0348 KOps/s 1.9513 KOps/s $\color{#35bf28}+4.28\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 7.3073ms 4.8700ms 205.3380 Ops/s 198.8432 Ops/s $\color{#35bf28}+3.27\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 0.7943ms 0.4979ms 2.0084 KOps/s 1.9665 KOps/s $\color{#35bf28}+2.13\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.7716ms 0.4788ms 2.0886 KOps/s 2.0271 KOps/s $\color{#35bf28}+3.04\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 5.4761ms 5.0768ms 196.9752 Ops/s 194.5408 Ops/s $\color{#35bf28}+1.25\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 3.2640ms 0.6528ms 1.5318 KOps/s 1.4907 KOps/s $\color{#35bf28}+2.76\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.8391ms 0.6163ms 1.6225 KOps/s 1.5809 KOps/s $\color{#35bf28}+2.63\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 0.6089s 16.4953ms 60.6235 Ops/s 219.2994 Ops/s $\textbf{\color{#d91a1a}-72.36\%}$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 7.5172ms 2.4028ms 416.1757 Ops/s 427.0694 Ops/s $\color{#d91a1a}-2.55\%$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 5.6306ms 1.4023ms 713.1016 Ops/s 788.6840 Ops/s $\textbf{\color{#d91a1a}-9.58\%}$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 5.5333ms 4.2762ms 233.8515 Ops/s 249.6300 Ops/s $\textbf{\color{#d91a1a}-6.32\%}$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 8.7015ms 2.3740ms 421.2215 Ops/s 449.9264 Ops/s $\textbf{\color{#d91a1a}-6.38\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 5.2848ms 1.3136ms 761.2941 Ops/s 725.5994 Ops/s $\color{#35bf28}+4.92\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 0.4024s 12.4212ms 80.5077 Ops/s 238.6218 Ops/s $\textbf{\color{#d91a1a}-66.26\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 3.6319ms 2.3660ms 422.6577 Ops/s 344.0947 Ops/s $\textbf{\color{#35bf28}+22.83\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 2.1283ms 1.4140ms 707.2202 Ops/s 637.9632 Ops/s $\textbf{\color{#35bf28}+10.86\%}$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] 11.8489ms 11.3207ms 88.3340 Ops/s 85.6938 Ops/s $\color{#35bf28}+3.08\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] 16.6823ms 15.1496ms 66.0084 Ops/s 64.9394 Ops/s $\color{#35bf28}+1.65\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] 20.7462ms 19.8618ms 50.3479 Ops/s 48.9912 Ops/s $\color{#35bf28}+2.77\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] 17.4144ms 15.3741ms 65.0445 Ops/s 63.8081 Ops/s $\color{#35bf28}+1.94\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] 21.2436ms 19.6457ms 50.9017 Ops/s 48.6527 Ops/s $\color{#35bf28}+4.62\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] 17.9551ms 16.5765ms 60.3264 Ops/s 58.1442 Ops/s $\color{#35bf28}+3.75\%$
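
For readers checking the table by hand: the "Change" column appears to be the relative difference between "Ops" and "Ops on Repo HEAD", that is `(Ops - HEAD) / HEAD`, with `KOps/s` values converted to `Ops/s` first. The sketch below recomputes it for three rows copied from the table. The formula is inferred from the printed figures rather than taken from the benchmark workflow, and the 5% flagging threshold is likewise an assumption that happens to match the bolded rows and the "Improved: 6 / Worsened: 5" counts. The helper names (`parse_ops`, `relative_change`, `is_flagged`) are hypothetical.

```python
"""Recompute the "Change" column from "Ops" and "Ops on Repo HEAD"."""

# Unit multipliers for the throughput strings that appear in the table.
UNIT_SCALE = {"Ops/s": 1.0, "KOps/s": 1_000.0}

# Assumption: a row is flagged (bold) when |change| exceeds about 5%.
# This threshold is inferred from the table, not from the benchmark code.
ASSUMED_THRESHOLD_PCT = 5.0


def parse_ops(text: str) -> float:
    """Convert a string such as '1.0016 KOps/s' to operations per second."""
    value, unit = text.split()
    return float(value) * UNIT_SCALE[unit]


def relative_change(ops: str, ops_head: str) -> float:
    """Percentage change of the PR throughput relative to repo HEAD."""
    new, base = parse_ops(ops), parse_ops(ops_head)
    return (new - base) / base * 100.0


def is_flagged(change_pct: float, threshold: float = ASSUMED_THRESHOLD_PCT) -> bool:
    """True when the change is large enough to count as improved or worsened."""
    return abs(change_pct) > threshold


if __name__ == "__main__":
    # (name, Ops, Ops on Repo HEAD) copied from the CPU table above.
    rows = [
        ("test_simple", "2.3299 Ops/s", "2.2237 Ops/s"),
        ("test_ddpg_speed[reduce-overhead-None]", "1.0016 KOps/s", "990.4908 Ops/s"),
        (
            "test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400]",
            "60.6235 Ops/s",
            "219.2994 Ops/s",
        ),
    ]
    for name, ops, head in rows:
        change = relative_change(ops, head)
        marker = " (flagged)" if is_flagged(change) else ""
        print(f"{name}: {change:+.2f}%{marker}")
```

Because the table prints rounded throughput values, a recomputed percentage can differ from the printed one in the last digit.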


github-actions bot commented Dec 13, 2024

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 149. Improved: $\large\color{#35bf28}12$. Worsened: $\large\color{#d91a1a}10$.

Name Max Mean Ops Ops on Repo HEAD Change
test_simple 0.7528s 0.7518s 1.3302 Ops/s 1.3164 Ops/s $\color{#35bf28}+1.05\%$
test_transformed 1.0141s 1.0104s 0.9897 Ops/s 0.9884 Ops/s $\color{#35bf28}+0.13\%$
test_serial 2.1626s 2.1608s 0.4628 Ops/s 0.4611 Ops/s $\color{#35bf28}+0.36\%$
test_parallel 2.1332s 2.0225s 0.4944 Ops/s 0.5085 Ops/s $\color{#d91a1a}-2.76\%$
test_step_mdp_speed[True-True-True-True-True] 0.1960ms 40.2431μs 24.8490 KOps/s 25.4530 KOps/s $\color{#d91a1a}-2.37\%$
test_step_mdp_speed[True-True-True-True-False] 57.3930μs 23.4521μs 42.6401 KOps/s 43.7728 KOps/s $\color{#d91a1a}-2.59\%$
test_step_mdp_speed[True-True-True-False-True] 49.8920μs 22.2571μs 44.9295 KOps/s 46.4645 KOps/s $\color{#d91a1a}-3.30\%$
test_step_mdp_speed[True-True-True-False-False] 36.7820μs 12.6594μs 78.9926 KOps/s 78.7802 KOps/s $\color{#35bf28}+0.27\%$
test_step_mdp_speed[True-True-False-True-True] 92.7040μs 42.7540μs 23.3896 KOps/s 23.7330 KOps/s $\color{#d91a1a}-1.45\%$
test_step_mdp_speed[True-True-False-True-False] 52.9530μs 25.2729μs 39.5681 KOps/s 40.4986 KOps/s $\color{#d91a1a}-2.30\%$
test_step_mdp_speed[True-True-False-False-True] 62.8530μs 24.0191μs 41.6336 KOps/s 43.6920 KOps/s $\color{#d91a1a}-4.71\%$
test_step_mdp_speed[True-True-False-False-False] 44.7120μs 15.0255μs 66.5535 KOps/s 67.3882 KOps/s $\color{#d91a1a}-1.24\%$
test_step_mdp_speed[True-False-True-True-True] 82.2540μs 44.6466μs 22.3981 KOps/s 22.5604 KOps/s $\color{#d91a1a}-0.72\%$
test_step_mdp_speed[True-False-True-True-False] 55.9130μs 27.0169μs 37.0138 KOps/s 37.2815 KOps/s $\color{#d91a1a}-0.72\%$
test_step_mdp_speed[True-False-True-False-True] 66.1430μs 23.9347μs 41.7804 KOps/s 41.5509 KOps/s $\color{#35bf28}+0.55\%$
test_step_mdp_speed[True-False-True-False-False] 53.7920μs 14.8453μs 67.3614 KOps/s 67.5893 KOps/s $\color{#d91a1a}-0.34\%$
test_step_mdp_speed[True-False-False-True-True] 84.6640μs 46.5197μs 21.4962 KOps/s 21.8257 KOps/s $\color{#d91a1a}-1.51\%$
test_step_mdp_speed[True-False-False-True-False] 63.0130μs 29.2568μs 34.1801 KOps/s 34.5143 KOps/s $\color{#d91a1a}-0.97\%$
test_step_mdp_speed[True-False-False-False-True] 57.1730μs 26.3997μs 37.8793 KOps/s 38.2464 KOps/s $\color{#d91a1a}-0.96\%$
test_step_mdp_speed[True-False-False-False-False] 55.8830μs 17.1042μs 58.4652 KOps/s 59.4641 KOps/s $\color{#d91a1a}-1.68\%$
test_step_mdp_speed[False-True-True-True-True] 79.4940μs 45.2740μs 22.0877 KOps/s 23.1633 KOps/s $\color{#d91a1a}-4.64\%$
test_step_mdp_speed[False-True-True-True-False] 59.0730μs 27.1977μs 36.7679 KOps/s 37.8738 KOps/s $\color{#d91a1a}-2.92\%$
test_step_mdp_speed[False-True-True-False-True] 61.6630μs 28.2962μs 35.3404 KOps/s 35.5281 KOps/s $\color{#d91a1a}-0.53\%$
test_step_mdp_speed[False-True-True-False-False] 40.6320μs 16.5230μs 60.5218 KOps/s 60.8204 KOps/s $\color{#d91a1a}-0.49\%$
test_step_mdp_speed[False-True-False-True-True] 82.5740μs 46.4382μs 21.5340 KOps/s 21.6333 KOps/s $\color{#d91a1a}-0.46\%$
test_step_mdp_speed[False-True-False-True-False] 59.6330μs 29.4152μs 33.9960 KOps/s 34.5357 KOps/s $\color{#d91a1a}-1.56\%$
test_step_mdp_speed[False-True-False-False-True] 3.2217ms 31.2224μs 32.0283 KOps/s 33.2612 KOps/s $\color{#d91a1a}-3.71\%$
test_step_mdp_speed[False-True-False-False-False] 52.8830μs 18.8108μs 53.1608 KOps/s 53.8725 KOps/s $\color{#d91a1a}-1.32\%$
test_step_mdp_speed[False-False-True-True-True] 0.1056ms 49.7882μs 20.0851 KOps/s 21.0367 KOps/s $\color{#d91a1a}-4.52\%$
test_step_mdp_speed[False-False-True-True-False] 56.3120μs 31.5221μs 31.7238 KOps/s 31.8833 KOps/s $\color{#d91a1a}-0.50\%$
test_step_mdp_speed[False-False-True-False-True] 59.5930μs 30.1982μs 33.1146 KOps/s 33.2276 KOps/s $\color{#d91a1a}-0.34\%$
test_step_mdp_speed[False-False-True-False-False] 44.5820μs 18.6741μs 53.5500 KOps/s 53.9428 KOps/s $\color{#d91a1a}-0.73\%$
test_step_mdp_speed[False-False-False-True-True] 76.7540μs 49.2670μs 20.2976 KOps/s 20.1841 KOps/s $\color{#35bf28}+0.56\%$
test_step_mdp_speed[False-False-False-True-False] 63.4030μs 33.7164μs 29.6592 KOps/s 30.2029 KOps/s $\color{#d91a1a}-1.80\%$
test_step_mdp_speed[False-False-False-False-True] 57.8420μs 31.4545μs 31.7920 KOps/s 32.0466 KOps/s $\color{#d91a1a}-0.79\%$
test_step_mdp_speed[False-False-False-False-False] 46.6620μs 20.3926μs 49.0374 KOps/s 48.5570 KOps/s $\color{#35bf28}+0.99\%$
test_values[generalized_advantage_estimate-True-True] 26.9475ms 25.9033ms 38.6051 Ops/s 38.5121 Ops/s $\color{#35bf28}+0.24\%$
test_values[vec_generalized_advantage_estimate-True-True] 94.3459ms 2.7926ms 358.0934 Ops/s 358.9892 Ops/s $\color{#d91a1a}-0.25\%$
test_values[td0_return_estimate-False-False] 0.1146ms 83.3342μs 11.9999 KOps/s 11.9256 KOps/s $\color{#35bf28}+0.62\%$
test_values[td1_return_estimate-False-False] 58.3416ms 56.6868ms 17.6408 Ops/s 17.5597 Ops/s $\color{#35bf28}+0.46\%$
test_values[vec_td1_return_estimate-False-False] 1.3910ms 1.0985ms 910.2954 Ops/s 907.4938 Ops/s $\color{#35bf28}+0.31\%$
test_values[td_lambda_return_estimate-True-False] 90.6916ms 90.0170ms 11.1090 Ops/s 10.8811 Ops/s $\color{#35bf28}+2.09\%$
test_values[vec_td_lambda_return_estimate-True-False] 1.4052ms 1.0935ms 914.4646 Ops/s 899.7194 Ops/s $\color{#35bf28}+1.64\%$
test_gae_speed[generalized_advantage_estimate-False-1-512] 26.5683ms 26.0050ms 38.4541 Ops/s 37.5468 Ops/s $\color{#35bf28}+2.42\%$
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 1.0810ms 0.7826ms 1.2778 KOps/s 1.2789 KOps/s $\color{#d91a1a}-0.09\%$
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 0.7805ms 0.6810ms 1.4685 KOps/s 1.4561 KOps/s $\color{#35bf28}+0.85\%$
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 1.5566ms 1.4889ms 671.6305 Ops/s 666.9324 Ops/s $\color{#35bf28}+0.70\%$
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 0.7734ms 0.7054ms 1.4176 KOps/s 1.4237 KOps/s $\color{#d91a1a}-0.43\%$
test_dqn_speed[False-None] 7.0398ms 1.5521ms 644.3025 Ops/s 662.0674 Ops/s $\color{#d91a1a}-2.68\%$
test_dqn_speed[False-backward] 2.2155ms 2.1566ms 463.6823 Ops/s 468.5407 Ops/s $\color{#d91a1a}-1.04\%$
test_dqn_speed[True-None] 0.6579ms 0.5529ms 1.8088 KOps/s 1.8541 KOps/s $\color{#d91a1a}-2.44\%$
test_dqn_speed[True-backward] 1.2936ms 1.2098ms 826.5568 Ops/s 905.6942 Ops/s $\textbf{\color{#d91a1a}-8.74\%}$
test_dqn_speed[reduce-overhead-None] 0.6606ms 0.5717ms 1.7490 KOps/s 1.7295 KOps/s $\color{#35bf28}+1.13\%$
test_dqn_speed[reduce-overhead-backward] 1.0996ms 1.0540ms 948.7834 Ops/s 1.0290 KOps/s $\textbf{\color{#d91a1a}-7.80\%}$
test_ddpg_speed[False-None] 3.1988ms 2.8666ms 348.8412 Ops/s 347.1569 Ops/s $\color{#35bf28}+0.49\%$
test_ddpg_speed[False-backward] 4.6842ms 4.2308ms 236.3598 Ops/s 240.1556 Ops/s $\color{#d91a1a}-1.58\%$
test_ddpg_speed[True-None] 1.1677ms 1.0634ms 940.4212 Ops/s 928.9976 Ops/s $\color{#35bf28}+1.23\%$
test_ddpg_speed[True-backward] 2.3531ms 2.2720ms 440.1430 Ops/s 435.3109 Ops/s $\color{#35bf28}+1.11\%$
test_ddpg_speed[reduce-overhead-None] 1.1429ms 1.0729ms 932.0755 Ops/s 916.5636 Ops/s $\color{#35bf28}+1.69\%$
test_ddpg_speed[reduce-overhead-backward] 1.8415ms 1.7705ms 564.8188 Ops/s 600.6305 Ops/s $\textbf{\color{#d91a1a}-5.96\%}$
test_sac_speed[False-None] 8.5024ms 8.0913ms 123.5898 Ops/s 123.0100 Ops/s $\color{#35bf28}+0.47\%$
test_sac_speed[False-backward] 11.7080ms 11.3416ms 88.1709 Ops/s 89.4735 Ops/s $\color{#d91a1a}-1.46\%$
test_sac_speed[True-None] 1.6190ms 1.5154ms 659.9038 Ops/s 648.0774 Ops/s $\color{#35bf28}+1.82\%$
test_sac_speed[True-backward] 3.5121ms 3.4000ms 294.1162 Ops/s 291.8512 Ops/s $\color{#35bf28}+0.78\%$
test_sac_speed[reduce-overhead-None] 23.4468ms 12.7227ms 78.5995 Ops/s 78.0747 Ops/s $\color{#35bf28}+0.67\%$
test_sac_speed[reduce-overhead-backward] 1.6018ms 1.4939ms 669.3674 Ops/s 655.8573 Ops/s $\color{#35bf28}+2.06\%$
test_redq_speed[False-None] 8.6793ms 7.6479ms 130.7547 Ops/s 130.2548 Ops/s $\color{#35bf28}+0.38\%$
test_redq_speed[False-backward] 12.6737ms 11.8399ms 84.4601 Ops/s 84.0187 Ops/s $\color{#35bf28}+0.53\%$
test_redq_speed[True-None] 2.1818ms 2.0164ms 495.9357 Ops/s 496.9791 Ops/s $\color{#d91a1a}-0.21\%$
test_redq_speed[True-backward] 3.8660ms 3.7090ms 269.6109 Ops/s 270.8046 Ops/s $\color{#d91a1a}-0.44\%$
test_redq_speed[reduce-overhead-None] 2.1753ms 2.0127ms 496.8513 Ops/s 491.7993 Ops/s $\color{#35bf28}+1.03\%$
test_redq_speed[reduce-overhead-backward] 4.0318ms 3.6520ms 273.8196 Ops/s 258.1608 Ops/s $\textbf{\color{#35bf28}+6.07\%}$
test_redq_deprec_speed[False-None] 9.8069ms 9.1991ms 108.7063 Ops/s 108.4112 Ops/s $\color{#35bf28}+0.27\%$
test_redq_deprec_speed[False-backward] 12.9459ms 12.2854ms 81.3973 Ops/s 79.1769 Ops/s $\color{#35bf28}+2.80\%$
test_redq_deprec_speed[True-None] 2.5066ms 2.3498ms 425.5669 Ops/s 424.3373 Ops/s $\color{#35bf28}+0.29\%$
test_redq_deprec_speed[True-backward] 4.5524ms 4.0631ms 246.1195 Ops/s 247.7403 Ops/s $\color{#d91a1a}-0.65\%$
test_redq_deprec_speed[reduce-overhead-None] 2.4068ms 2.3225ms 430.5746 Ops/s 427.6937 Ops/s $\color{#35bf28}+0.67\%$
test_redq_deprec_speed[reduce-overhead-backward] 4.4208ms 4.0259ms 248.3942 Ops/s 247.7113 Ops/s $\color{#35bf28}+0.28\%$
test_td3_speed[False-None] 8.2282ms 7.9659ms 125.5351 Ops/s 124.9689 Ops/s $\color{#35bf28}+0.45\%$
test_td3_speed[False-backward] 11.0257ms 10.3956ms 96.1948 Ops/s 96.4331 Ops/s $\color{#d91a1a}-0.25\%$
test_td3_speed[True-None] 1.6450ms 1.5789ms 633.3336 Ops/s 621.8602 Ops/s $\color{#35bf28}+1.85\%$
test_td3_speed[True-backward] 3.2409ms 3.1071ms 321.8458 Ops/s 318.3139 Ops/s $\color{#35bf28}+1.11\%$
test_td3_speed[reduce-overhead-None] 82.9908ms 26.0922ms 38.3257 Ops/s 37.3131 Ops/s $\color{#35bf28}+2.71\%$
test_td3_speed[reduce-overhead-backward] 1.3584ms 1.2870ms 777.0103 Ops/s 757.9297 Ops/s $\color{#35bf28}+2.52\%$
test_cql_speed[False-None] 17.4282ms 16.8671ms 59.2871 Ops/s 58.6649 Ops/s $\color{#35bf28}+1.06\%$
test_cql_speed[False-backward] 22.9347ms 22.1271ms 45.1934 Ops/s 44.9775 Ops/s $\color{#35bf28}+0.48\%$
test_cql_speed[True-None] 3.0547ms 2.9192ms 342.5558 Ops/s 338.9136 Ops/s $\color{#35bf28}+1.07\%$
test_cql_speed[True-backward] 5.4901ms 5.0656ms 197.4110 Ops/s 185.7570 Ops/s $\textbf{\color{#35bf28}+6.27\%}$
test_cql_speed[reduce-overhead-None] 21.8863ms 13.3280ms 75.0300 Ops/s 75.2248 Ops/s $\color{#d91a1a}-0.26\%$
test_cql_speed[reduce-overhead-backward] 1.6576ms 1.5131ms 660.8745 Ops/s 587.4958 Ops/s $\textbf{\color{#35bf28}+12.49\%}$
test_a2c_speed[False-None] 3.3312ms 3.2155ms 310.9976 Ops/s 298.2724 Ops/s $\color{#35bf28}+4.27\%$
test_a2c_speed[False-backward] 6.6639ms 6.2399ms 160.2597 Ops/s 154.1760 Ops/s $\color{#35bf28}+3.95\%$
test_a2c_speed[True-None] 1.2254ms 1.0317ms 969.3055 Ops/s 987.5985 Ops/s $\color{#d91a1a}-1.85\%$
test_a2c_speed[True-backward] 2.6592ms 2.5842ms 386.9652 Ops/s 357.6313 Ops/s $\textbf{\color{#35bf28}+8.20\%}$
test_a2c_speed[reduce-overhead-None] 21.8041ms 11.7491ms 85.1130 Ops/s 86.8362 Ops/s $\color{#d91a1a}-1.98\%$
test_a2c_speed[reduce-overhead-backward] 1.0258ms 0.9671ms 1.0341 KOps/s 869.9048 Ops/s $\textbf{\color{#35bf28}+18.87\%}$
test_ppo_speed[False-None] 3.9922ms 3.7305ms 268.0629 Ops/s 268.4651 Ops/s $\color{#d91a1a}-0.15\%$
test_ppo_speed[False-backward] 7.3194ms 6.9198ms 144.5121 Ops/s 139.4942 Ops/s $\color{#35bf28}+3.60\%$
test_ppo_speed[True-None] 1.0303ms 0.9463ms 1.0567 KOps/s 1.0545 KOps/s $\color{#35bf28}+0.22\%$
test_ppo_speed[True-backward] 2.6550ms 2.5267ms 395.7722 Ops/s 386.6456 Ops/s $\color{#35bf28}+2.36\%$
test_ppo_speed[reduce-overhead-None] 0.7679ms 0.5017ms 1.9932 KOps/s 1.8757 KOps/s $\textbf{\color{#35bf28}+6.27\%}$
test_ppo_speed[reduce-overhead-backward] 1.0008ms 0.9573ms 1.0446 KOps/s 997.4947 Ops/s $\color{#35bf28}+4.72\%$
test_reinforce_speed[False-None] 2.5412ms 2.2738ms 439.7932 Ops/s 438.0857 Ops/s $\color{#35bf28}+0.39\%$
test_reinforce_speed[False-backward] 3.7503ms 3.2929ms 303.6812 Ops/s 300.8938 Ops/s $\color{#35bf28}+0.93\%$
test_reinforce_speed[True-None] 0.8765ms 0.8150ms 1.2270 KOps/s 1.2023 KOps/s $\color{#35bf28}+2.05\%$
test_reinforce_speed[True-backward] 2.5025ms 2.4004ms 416.5932 Ops/s 408.0478 Ops/s $\color{#35bf28}+2.09\%$
test_reinforce_speed[reduce-overhead-None] 23.1067ms 11.9502ms 83.6807 Ops/s 86.8509 Ops/s $\color{#d91a1a}-3.65\%$
test_reinforce_speed[reduce-overhead-backward] 1.0684ms 1.0303ms 970.5777 Ops/s 921.7141 Ops/s $\textbf{\color{#35bf28}+5.30\%}$
test_iql_speed[False-None] 9.8629ms 9.3620ms 106.8148 Ops/s 107.0436 Ops/s $\color{#d91a1a}-0.21\%$
test_iql_speed[False-backward] 13.8832ms 13.1895ms 75.8178 Ops/s 76.0917 Ops/s $\color{#d91a1a}-0.36\%$
test_iql_speed[True-None] 1.9258ms 1.8106ms 552.3002 Ops/s 573.6920 Ops/s $\color{#d91a1a}-3.73\%$
test_iql_speed[True-backward] 4.5637ms 4.2439ms 235.6302 Ops/s 232.3424 Ops/s $\color{#35bf28}+1.42\%$
test_iql_speed[reduce-overhead-None] 15.6000ms 9.1907ms 108.8056 Ops/s 87.1124 Ops/s $\textbf{\color{#35bf28}+24.90\%}$
test_iql_speed[reduce-overhead-backward] 1.6447ms 1.5982ms 625.6941 Ops/s 622.0420 Ops/s $\color{#35bf28}+0.59\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 7.8607ms 6.4050ms 156.1277 Ops/s 152.2286 Ops/s $\color{#35bf28}+2.56\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.6412ms 0.3615ms 2.7662 KOps/s 2.8109 KOps/s $\color{#d91a1a}-1.59\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.5462ms 0.3226ms 3.0997 KOps/s 2.9572 KOps/s $\color{#35bf28}+4.82\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.5233ms 6.1621ms 162.2814 Ops/s 159.3717 Ops/s $\color{#35bf28}+1.83\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 2.0801ms 0.3423ms 2.9210 KOps/s 3.1918 KOps/s $\textbf{\color{#d91a1a}-8.48\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.5553ms 0.3276ms 3.0525 KOps/s 3.4409 KOps/s $\textbf{\color{#d91a1a}-11.29\%}$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 1.6611ms 1.4376ms 695.5853 Ops/s 701.8054 Ops/s $\color{#d91a1a}-0.89\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 1.5602ms 1.2872ms 776.8563 Ops/s 813.7836 Ops/s $\color{#d91a1a}-4.54\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.5491ms 6.3612ms 157.2023 Ops/s 155.9072 Ops/s $\color{#35bf28}+0.83\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 2.2102ms 0.4937ms 2.0254 KOps/s 2.1389 KOps/s $\textbf{\color{#d91a1a}-5.31\%}$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.7661ms 0.4400ms 2.2725 KOps/s 2.2472 KOps/s $\color{#35bf28}+1.13\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 6.4317ms 6.2345ms 160.3984 Ops/s 159.0512 Ops/s $\color{#35bf28}+0.85\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 1.8531ms 0.3945ms 2.5347 KOps/s 3.1166 KOps/s $\textbf{\color{#d91a1a}-18.67\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.4715ms 0.2806ms 3.5641 KOps/s 3.3540 KOps/s $\textbf{\color{#35bf28}+6.26\%}$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.4378ms 6.1128ms 163.5909 Ops/s 160.3071 Ops/s $\color{#35bf28}+2.05\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 0.7475ms 0.2613ms 3.8263 KOps/s 2.8223 KOps/s $\textbf{\color{#35bf28}+35.58\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.4436ms 0.2413ms 4.1441 KOps/s 4.1544 KOps/s $\color{#d91a1a}-0.25\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.5758ms 6.3319ms 157.9307 Ops/s 155.7548 Ops/s $\color{#35bf28}+1.40\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 2.0342ms 0.5055ms 1.9781 KOps/s 2.1143 KOps/s $\textbf{\color{#d91a1a}-6.44\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.6902ms 0.4661ms 2.1454 KOps/s 2.2117 KOps/s $\color{#d91a1a}-3.00\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 6.8844ms 5.2652ms 189.9245 Ops/s 187.8567 Ops/s $\color{#35bf28}+1.10\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 3.8097ms 1.9259ms 519.2487 Ops/s 435.9245 Ops/s $\textbf{\color{#35bf28}+19.11\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 8.9203ms 1.2557ms 796.3984 Ops/s 865.8116 Ops/s $\textbf{\color{#d91a1a}-8.02\%}$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 7.5423ms 5.3623ms 186.4885 Ops/s 189.9884 Ops/s $\color{#d91a1a}-1.84\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 0.4997s 11.9678ms 83.5576 Ops/s 431.8743 Ops/s $\textbf{\color{#d91a1a}-80.65\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 7.5639ms 1.1964ms 835.8363 Ops/s 852.5554 Ops/s $\color{#d91a1a}-1.96\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 7.3638ms 5.5625ms 179.7763 Ops/s 32.8088 Ops/s $\textbf{\color{#35bf28}+447.95\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 9.1984ms 2.1895ms 456.7276 Ops/s 468.8403 Ops/s $\color{#d91a1a}-2.58\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 7.6818ms 1.3970ms 715.8069 Ops/s 712.6977 Ops/s $\color{#35bf28}+0.44\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] 13.4964ms 13.0825ms 76.4378 Ops/s 72.9898 Ops/s $\color{#35bf28}+4.72\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] 19.2506ms 17.2353ms 58.0203 Ops/s 56.9020 Ops/s $\color{#35bf28}+1.97\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] 18.2904ms 17.3308ms 57.7007 Ops/s 55.3180 Ops/s $\color{#35bf28}+4.31\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] 19.6704ms 17.4874ms 57.1839 Ops/s 55.9741 Ops/s $\color{#35bf28}+2.16\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] 17.5545ms 17.3604ms 57.6022 Ops/s 55.4559 Ops/s $\color{#35bf28}+3.87\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] 20.3674ms 18.7030ms 53.4674 Ops/s 52.8441 Ops/s $\color{#35bf28}+1.18\%$

[ghstack-poisoned]
@vmoens vmoens merged commit 10093ab into gh/vmoens/38/base Dec 14, 2024
50 of 64 checks passed
vmoens added a commit that referenced this pull request Dec 14, 2024
ghstack-source-id: f18928a419f81794d6870fd4e9fe1205c1b137e1
Pull Request resolved: #2555
@vmoens vmoens deleted the gh/vmoens/38/head branch December 14, 2024 00:07